9 research outputs found

    Probabilistic Relational Model Benchmark Generation

    The validation of any database mining methodology relies on an evaluation process for which the availability of benchmarks is essential. In this paper, we aim to randomly generate relational database benchmarks that make it possible to check probabilistic dependencies among attributes. We are particularly interested in Probabilistic Relational Models (PRMs), which extend Bayesian Networks (BNs) to a relational data mining context and enable effective and robust reasoning over relational data. Although many works have focused, separately, on the generation of random Bayesian networks and of relational databases, no work has been identified for PRMs on that track. To fill this gap, this paper provides an algorithmic approach for generating random PRMs from scratch. The proposed method generates PRMs as well as synthetic relational data from a randomly generated relational schema and a random set of probabilistic dependencies. This can be of interest not only to machine learning researchers, who can evaluate their proposals in a common framework, but also to database designers, who can evaluate the effectiveness of the components of a database management system.

    Les modèles probabilistes relationnels : apprentissage et évaluation

    No full text
    Statistical relational learning (SRL) appeared in the early 2000s as a new field of machine learning that enables effective and robust reasoning directly over relational data structures. Several conventional data mining methods have been adapted for direct application to relational data. Relational Bayesian Networks (RBNs) extend Bayesian networks (BNs) to this relational context. To use such a model, it must first be built: the structure and parameters of an RBN must be set manually or learned from an instance of a relational database. Learning the structure remains the most complicated issue, as it is an NP-hard problem. Existing approaches for RBN structure learning are inspired by classical methods for learning the structure of BNs. Evaluating a structure learning algorithm requires test datasets and evaluation measures. For BNs, datasets are usually sampled from known real networks; otherwise, processes to randomly generate the model and the data are well established. Both practices are almost absent for RBNs. Moreover, metrics for evaluating an RBN structure learning algorithm have not yet been established. This thesis provides two major contributions. I) An approach for generating random RBNs from scratch, going from the generation of the relational schema, the dependency structure, and the probability tables to the instantiation of the model and the population of a relational database. We also discuss the adaptation of the evaluation metrics of BN structure learning algorithms to the relational context and propose new relational evaluation measures. II) A hybrid approach for RBN structure learning, which extends the MMHC algorithm to the relational context. We present an experimental study comparing this new learning algorithm with state-of-the-art approaches.

    Probabilistic relational model benchmark generation: Principle and application

    No full text

    A RBN-based recommender system architecture

    No full text
    With the widespread use of the Internet, recommender systems are increasingly used to address the problem of information overload and to deal with large amounts of online information. Several approaches and techniques have been proposed to implement recommender systems. Most of them rely on a flat data representation, while most real-world data are stored in relational databases. This paper proposes a new recommendation approach that exploits the relational nature of the data at hand using relational Bayesian networks (RBNs).

    A hybrid approach for probabilistic relational models structure learning

    No full text
    Probabilistic relational models (PRMs) extend Bayesian networks (BNs) to a relational data mining context. Just like BNs, the structure and parameters of a PRM must be either set by an expert or learned from data. Learning the structure remains the most complicated issue, as it is an NP-hard problem. Existing approaches for PRM structure learning are inspired by classical methods for learning the BN structure. Extensions of the constraint-based and score-based methods have been proposed. However, hybrid methods have not yet been adapted to relational domains, although some of them, such as the Max-Min Hill Climbing (MMHC) algorithm, show better experimental performance in the classical context than constraint-based and score-based methods. In this paper, we present an adaptation of MMHC to relational domains, together with an experimental study comparing our new approach to state-of-the-art relational structure learning algorithms.

    Ontology-based generation of object oriented bayesian networks

    No full text
    Probabilistic Graphical Models (PGMs) are powerful tools for representing and reasoning under uncertainty. Although useful in several domains, PGMs suffer from their building phase, which is known to be mostly an NP-hard problem and can limit their application to some extent, especially in real-world settings. Ontologies, for their part, provide a body of structured knowledge characterized by its semantic richness. This paper proposes to harness the representational capabilities of ontologies in order to enrich the process of building PGMs. We are particularly interested in object-oriented Bayesian networks (OOBNs), which extend standard Bayesian networks (BNs) using the object paradigm. We show how the semantic richness of ontologies might be a potential solution to the challenging problem of OOBN structure learning while minimizing expert involvement, which is not always easy to obtain. More precisely, we propose a set of mapping rules that generate a prior OOBN structure by morphing an ontology related to the problem under study; this structure is then used as a starting point for the global OOBN building algorithm.